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' The Stormer-Verlet-leapfrog group of integrators commonly used in molecular dynamics simu- 

' lations has long become a textbook subject and seems to have been studied exhaustively. There 

are, however, a few striking effects in performance of algorithms which are well-known but have not 
received adequate attention in the literature. A closer view of these unclear observations results 
in unexpected conclusions. It is shown here that contrary to the conventional point of view, the 
leapfrog scheme is distinguished in this group both in terms of the order of truncation errors and 
the conservation of the total energy. In this case the characteristic square growth of fluctuations of 
the total energy with the step size, commonly measured in numerical tests, results from additional 
interpolation errors with no relation to the accuracy of the computed trajectory. An alternative pro- 
(~| I cedure is described for checking energy conservation of leapfrog-like algorithms which is free from 

^ |. interpolation errors. Preliminary tests on a representative model system suggest that standard step 

JL ' size values used at present are lower than necessary for accurate sampling. 

O ■ I. INTRODUCTION 

O 

Q ' The well-known group of integrators for the Newton's equations comprising Verlet ^ , leapfrog |^ , velocity Verlet 
0' ^^"^ Beeman Q methods play a central role in the classical methodology of molecular dynamics. Due to their 
simplicity and exceptional stability they proved to be the best choice for long time - large step size calculations and 
r'^ they are now employed in the great majority of simulations of large systems. All or some of these methods are 
O i' always described in detail and compared in any modern textbook |^,]^,^. It is well-known that they are based upon 
. a ninety- year-old Stormer time-centered difference approximation of accelerations 1^,^ , that given appropriate initial 
I ■ conditions they produce the same trajectory in coordinate space and that, therefore, they all represent variations of 
^ , a single algorithm. 

00 ' Despite this long prehistory, the performance of these algorithms still attracts remarkable attention, first, because 
] of their practical importance and, second, because understanding of the origin of their exceptional properties may help 
i to develop even better algorithms. In particular, their stability has been attributed to the correspondence between 
' the order of the finite difference equation and its analytical analog 0, to their time reversibility [pPJlO[|, to oscillating 
fundamental solutions Jill and to their symplectic property A hallmark of these method is a square power 

growth of global errors with the step size, which is invariably observed with different testing techniques and can even 
be used for debugging computer codes ||^ . Because of the low apparent order of approximation they usually appear 
^ ' less accurate than other methods in comparative studies, but, with large step sizes, when other algorithms loose 
stability, the leapfrog scheme and its analogs still produce trajectories with very low total energy drift and correct 
average static and dynamic properties ||l4|-|l6|. These observations are known and exploited for such a long time 
_ that it seems to have been forgotten that actually they are quite unusual and that this behavior still has no clear 
rS |. explanation. 

The observed persistence of averages and a low drift of the total energy might mean that the numerical trajectory 
computed with a large step size, though inaccurate in a strict sense, holds well to the correct constant energy 
hypersurface in phase space. It is well-known, however, that compared with other integrators the Stormer-equivalent 
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, algorithms conserve the instantaneous total energy rather poorly - its value already strongly fluctuates at time steps 
. - - 1 well below the theoretical limit of stability. Thus, it appears that the trajectory constantly deviates from the initial 
hypersurface, but always finds a way back, which is rather striking because, for example, the velocity Verlet algorithm 
is self-starting and, consequently, keeps no memory of the preceding part of the trajectory. One explanation to these 
observations follows from the general property of symplectic integrators which, with a sufficiently small time step, 
generate discretizations of exact trajectories corresponding to perturbed Hamiltonians |p^ . This property, however, 
holds for small time steps only and it does not explain why the leapfrog-equivalent algorithms are distinguished among 
symplectic integrators as well. 

The above contradiction may be settled if one assumes that the total energy is actually conserved better than 
it appears. For a Stormer or a leapfrog trajectory this is possible, in principle, because in these cases the total 
energy is not a well-defined quantity. The original Stormer formula [n]jq| needs only coordinates and accelerations 
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for computing a trajectory and so, strictly speaking, velocities are not defined. There are many methods to evaluate 
velocities and kinetic energies but they all employ some finite difference approximations and, usually, interpolations, 
which is often done implicitly, as in the case of the velocity Verlet and Beeman algorithms . ft is understood that 
these computations may introduce additional errors into the computed total energy, but these errors are difficult to 
evaluate or to get rid of and one cannot generally tell how large is their relative contribution. 

The initial goal of the present study was to check more accurately how well a leapfrog trajectory holds to a constant- 
energy hypersurface, when only approximations intrinsic to the algorithm play a role and there is no influence of 
additional approximations and interpolations. As happens sometimes, the solution of a practical task has lead to a 
comprehensive "06 initid^ analysis of several related problems which have been unrecognized or simply ignored in 
the literature. It is shown here that contrary to the conventional point of view, the leapfrog scheme is more accurate 
than other algorithms of this group, both in terms of the order of the truncation errors and the conservation of the 
total energy. The observed square growth of fluctuations of the total energy with the step size appears to result 
from interpolations only and has no relation to the accuracy of the computed trajectory. An alternative procedure is 
proposed for checking energy conservation of leapfrog-like algorithms which is free from interpolation errors. Testing 
on an example of protein dynamics confirms that the leapfrog trajectory really manages to sample from a correct 
hypersurface in phase space with larger step sizes than commonly recommended. 

When a classical text-book subject is discussed it is difficult to maintain the logic of argumentation without 
reproducing some well-known general results. The author hopes, therefore, to be excused by experts if some of the 
derivations in the text appear trivial. 



II. RESULTS AND DISCUSSION 



We will consider Newton's equation for a particle of a unit mass 

i^fix) (1) 

where the dot notation denotes time derivatives. The Stormer fourth order algorithm (the order of the algorithm here 
and below refers to the order of the truncation error) first introduced in this field by Verlet [Q is 

Xn+l = 2Xn - Xn-1 + a„/l^ (2) 

where x„ and a„ are the coordinate and the acceleration at the nth time step and h is the step size. The leapfrog 
scheme popularized by Hockney and Eastwood is 

Vr,+ i =v,^_i+a„h (3a) 

Xn+l = Xn + W„+i (3b) 

where w„_|_i is the half-step velocity. The Swope et al. |^ formulation of the algorithm referred to as velocity Verlet 
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Xn+l = Xn + (^Vn + a„^^ ft, (4a) 
Vn+1 ^ Vn + (a„ + a„+i) - (4b) 



Finally, the Beeman method 



Xn+l = Xn + Vnh + |^-a„ - g""-l j (^^) 

Vn+1 = w„ + Qa„+i + ^a„ - ^fln-i^ h. (5b) 

Instructive discussions of the relationships between these methods can be found elsewhere P,^| Jl0|Jl7| , [l8|| . The algo- 
rithms defined by Eqs. (2)-(5) significantly differ in the implicitly adopted point of view upon the definition of the 
on-step velocities and, consequently, the total energy. For the velocity Verlet and Beeman integrators the on-step 
velocity Vn-, kinetic energy Kn and the total energy En are well defined quantities. In contrast, for both the leapfrog 
and the original Stormer algorithms no complete trajectory in phase space is computed and one can choose between 
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different interpolation formulae for the calculation of on-step velocities or kinetic energies. Because of the reasons 
given below it is most convenient for us to take the second choice. Let us consider that our trajectory is computed 
by the leapfrog integrator (3a, b). The coordinates thus obtained automatically satisfy Eq. (2). The corresponding 
solution of Eqs. (4a, b) is obtained if we employ the interpolation formula 



and the solution of Eqs. (5a, b) with 
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li^n-i+Vn+l) (6) 



2w„, 1 +5u„^i -w„_i) (7) 



It should be noted that since the Beeman algorithm employs coordinates and velocities from several time steps it is 
equivalent to the leapfrog algorithm with interpolation (7) only if its initial conditions match Eqs. (3a, b). 

When analytical trajectories are described it is common to use the term "state" to refer to a point in phase space. 
From any state the analytical trajectory can be continued in a unique way in both time directions. For the following 
discussion, it will be convenient to use a similar terminology referring to numerical trajectories. Thus, a leapfrog 
state is defined as a pair (a;„, d„_i/2) , a velocity Verlet state as (x„,w„), and a Stormer state as (a;„,a;„_i). In each 
of these three cases there exists a unique analytical solution of Eq. (1) to which these states belong although the 
corresponding exact trajectories are not identical. 



A. The Order of Approximation of The Leapfrog Scheme 



This very basic and apparently simple question appears to be very confused and persistently misinterpreted in the 
literature. Fundamental textbooks always demonstrate that algorithm (2) is accurate to fourth order but add that 
velocities are usually computed with a lower accuracy. Equivalence between all integrators of this group also appears 
only when low-order approximations are used for velocities [p^ . At the same time, Eqs. (3a, b) and the velocity Verlet 
and Beeman algorithms are only O (ft."^) accurate. In various numerical tests all these integrators exhibit an O (ft.^) 
growth of global errors in agreement with O (^h^) order of the local truncation error 1^,^. This occurs even when no 
lower-order approximations are involved, for instance, if Eq. (2) is used and the deviation of coordinates from an 
analytical solution is checked [Q. Although these observations are not contradictory they are rather perplexing, but, 
however surprising, they are essentially ignored in the literature. The confusion clearly results from approximations 
of velocities which are not included in Eq. (2), but since all these algorithms can produce the same trajectory the 
difhculty is only formal and seems to present no practical interest. We will now see that this reasoning involves a 
conceptual error with important consequences. 

Consider the standard derivation of Eq. (2) , which starts from two Taylor series from time t in opposite directions 

x{t + h):^x{t)+x{t)h+ -it (t) + -X {t) + O (h^) (8a) 

2 6 

x{t^h) = x{t)-x{t)h+ -X (t) - -X (t) h^ + (/i^) (8b) 

2 6 

Adding these two expansions gives Eq. (2), with all odd-order terms eliminated, and a resultant truncation error of 
O (/i"*) . Now let us perform a similar derivation for half-step velocities, which gives 

and by substituting 

^[t-^)-ll^W-^it-h)] + 0{h^) (10) 

we get a fourth order finite difference equation 

Vn+i = 2u„_ 1 - i;„_| -I- (a„_i - a„) (11) 

It is readily seen that any trajectory produced by the leapfrog scheme (3a, b) satisfies this equation, so it is an exact 
analog of the Stormer formula (2) for velocities. We see, therefore, that the order of approximation of the leapfrog 
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scheme is four rather than three, both for coordinates and velocities. Eqs. (2) and (11) are common fourth order 
predictors which use values from several time steps to reduce the truncation error. The fact that the same trajectory 
can be generated by lower-order Eqs. (3a, b) results from a fortunate cancelation of errors. 

Note, however, that Eqs. (2), (11) and Eqs. (3a, b) are not equivalent. Eq. (11) provides a leapfrog solution 
only if initial conditions match Eq. (3a). Note also, that Eqs. (2) and (11) are coupled only via accelerations - no 
relationship between coordinates and velocities is implied a priori. Therefore, starting from arbitrary initial conditions 
a completely non- leapfrog trajectory may be produced. The relationship between the general solutions of Eqs. (2), 
(11) and (3a, b) presents an interesting question, but it is beyond the scope of the present paper. 



B. Accumulation of Errors Along A Leapfrog Trajectory 

The truncation error is in fact only a part of the story because single-step errors can accumulate thus producing 
so called global errors which are relevant for assessment of an algorithm and which, as noted above, exhibit a 

characteristic O (/i^) growth. Let us check how, in the case of a leapfrog trajectory, truncation errors accumulate for 
conventional testing procedures. 

Suppose we have a trajectory of duration T; we compute some parameter Q [x (ti) , v {ti)] at M time steps i — 1, M 
and evaluate 



M 1 ^ 



(12) 



Here Qq (ti) is the corresponding analytical value computed for an exact trajectory starting from the initial state. An 
overbar here and below denotes time averaging. Now we have 



J=o 



(13) 



where 



AAQ,- 



AQj+i - AQj - Q {t,+i) - Q (t,) - Qg (t^+i) + Q°- (i,) = 
[Q (tj+i) - itj+i)] + [Q^ (tj+i) - Q (t,)] - [Qg (t,+i) - QS (<,)] 



(14) 



where (t) refers to the analytical trajectory starting from jth state in the numerical trajectory. Here only the first 
bracket in the r.h.s. is contributed by the local truncation error while the rest results from deviations of analytical 
solutions corresponding to different initial conditions. Discarding higher order terms one can rewrite Eq. (14) as 



AAQj 



(15) 



where aj are constant coefficients and, for the leapfrog scheme, m — A. 

In the general case both terms in the r.h.s. of Eq. (15) must be taken into consideration. Let us check the case 
when Q is just the coordinate and so Eq.(12) estimates deviation from an analytical trajectory. Let Xj {t) and a;°_|_i {t) 
be two analytical solutions of Eq. (1) with boundary conditions 



x'^ {t - h) ^ Xj^i; x°j{t)^Xj 
a^j+i (0 = ; x^+i {t + h) = Xj+i 

where and Xj^i are successive points on a numerical trajectory. We have 



(16a) 
(16b) 



^"+1 {t + h)= X, + {t)h + f {x,) ■— + i]^^ {t) ■— -f {t)'^ + (h^) 



x^{t^h) = x,-x^{t)h + fix,)-- ■".-24 
Taking into account that Xj-i, Xj and Xj+i are related by Eq. (2) summation of Eqs (17a, b) yields 



(17a) 
(17b) 



2l + O{h')=0 



(18) 
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Thus, except for the special case ij+i {t) = i° (t) we have 

it) - i-^ it) = O (h^) (19) 
Assuming that x"^ (t) are smooth enough, relation (19) can be continued over a finite time interval and we may write 

where /3fc are constant coefficients and /3(tj) results from averaging over interval (0,tj) for a sufficiently small step 
size. Thus in Eq. (15) the second term appears to be O [h^) and it dominates over the truncation error given by 
the first term. Now, by performing the summation in Eq.(13) similarly to Eq. (20), we see that the deviation of the 
leapfrog trajectory from an analytical solution must be of the order of O (/i^) in agreement with usual conclusions 
[0 . We see, therefore, that the square growth of the deviation of the coordinates from an analytical trajectory has 
a more complicated origin than the simple accumulation of truncation errors and it does not contradict the O (/i^) 
truncation error of the algorithm. 

Now let us consider the case when Q is the total energy. For any analytical solution its time derivative is zero and 
AAEj in Eq. (15) appears to be O (h^). The same result is obtained with the average total energy instead of the 
analytical value in Eq. (12). Summation in Eq. (13) now gives the global error on the order of O (h^), that is one 
order higher than the common conclusion Pjl^]. There is one complication, however, which was missed in the above 
derivations. We tentatively assumed in Eq. (14) that for any state on a numerical trajectory one can find a reference 
analytical solution passing through this state. For the next-step total energy to deviate as O (/i"*) both coordinates 
and velocities must be within O (ft-^) of this reference trajectory, which, in turn, requires that it pass through both 
(xmXn-i) and ('y„_i/2, i'n-3/2) according to Eqs. (2) and (11), respectively. However, since Eq. (1) is only of second 
order, such an analytical solution does not necessarily exist. In the general case, we have two different analytical 
trajectories corresponding to coordinates and velocities and the overall single-step error in the total energy has a 
contribution from incoherence between these trajectories. On the other hand, if we use as a reference an analytical 
trajectory passing through a leapfrog state or a velocity Verlet state we obtain O (/i"^) single-step deviation and O {h?) 
global error, but this is only an upper estimate because it does not take into account the cancelation of local errors 
which allows solutions of Eqs. (3a,b) to fit higher order equations (2) and (11). We conclude, therefore, that the 
global error in the total energy for a leapfrog trajectory must be between O {h?) and O (h^), but it may depend upon 
the specific properties of Eq. (1). That is why the order of global errors in this case cannot be estimated analytically 
a priori, and should rather be measured numerically. We will see in the next section, however, that this appears 
difficult. 



C. Effects of Interpolations of Kinetic Energy 

We are going to show here that the common approach to testing energy conservation in molecular dynamics in the 
case of the leapfrog scheme in fact fails to evaluate the true error of the algorithm because of a dominating contribution 
of additional interpolations necessary to compute on-step kinetic energies. Following many previous studies, we first 
consider a simple harmonic oscillator as a model case suitable for analytical treatment. This simple model appears 
to have a significant predictive power because the fastest motions in real systems are nearly harmonic. We will check 
this on a more realistic example of protein dynamics. 

Consider an oscillator with the Hamiltonian 

H^^{x'+Lo'x^) (21) 

and the leapfrog equations of motion 

v^+i - v„_i = -uj^xnh (22a) 

X„+i -Xn ^ ^n+i'i (22b) 

It is known that these finite difference equations have an analog of the total energy |^]. By multiplying Eq. (22a) by 
(w„+i/2 + Vn-1/2) with simple algebra one obtains 

^ (^n+i + (^"^XnXn+i^ = ^ (^^- i + w^a;„_ix„^ = £i (23) 
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Similarly, by multiplying Eq (22b) by {xn+i + Xn) we obtain 



1 + 



2 2 

to X„ 



2 2 
1 + W X„_i 



^ £2 



(24) 



and it is easy to check that £i = £2 = £• Thus, there is a quantity with the dimension of energy which is exactly 
conserved along the numerical trajectory. In the general case a numerical "first integral" analogous to e does not exist, 
but it is very useful in analytical derivations and gives the possibility to reveal certain qualitative features intrinsic 
in the common approaches to the estimation of the on-step total energy. Consider the most common approximation 
of the on-step total energy for the leapfrog scheme 



I 2 2 



(25) 



where t = u>h is the reduced step size. Here we made a substitution of Eqs. (22a) and (24) and denoted potential 

energy as As expected, the total energy is not constant and it fluctuates with a small amplitude on the order of 
O (r^). The relative fluctuation, sometimes used as a convenient indicator of the accuracy of the trajectory, is simply 



Qif = 



DjEif] 
D[U] 



T 

y 



(26) 



where D [ ] denotes the operator of variance. The fluctuating term in Eq. (25) is usually used to check for the accuracy 
and stability of a computed trajectory. One can note, however, that the analytical form of this term is somewhat 
suspicious. Namely, it could have been expected that a numerical fluctuation of an analytically constant value has 
a more complicated form than just a scaled oscillation of the potential energy. In order to make clear the origin of 
this oscillation let us consider the related value for an exact analytical trajectory of an oscillator with a; (0) = and 
v{0) = CO. We have 



COS^ LO i t — ^1 + COS^ Ul ( 



sin (jjt 



En. 1 



(27) 



where Ea and U (t) are analytical total and potential energies, respectively. Comparison of this result with Eq. (25) 
shows that the O (r^) oscillation is exactly same for the numerical trajectory as for the analytical one suggesting 
that it is introduced by the interpolation and has no relation to the accuracy of the trajectory. In order to validate 
this suggestion let us check what one gets when more accurate interpolation formulas are employed. Consider the 
interpolation 



(28) 



where Kn-1/2 denotes the half-step kinetic energy. It is not difflcult to check that when applied to an exact trajectory 
it gives an O {t^) oscillation around the correct total energy. For the leapfrog solution, derivations similar to that 
used for Eq. (25) result in 



E^=e[l + —]+ —ujx„v. 



(29) 



Again we see that the amplitude of the oscillation scales as the order of the interpolation. Note that its phase is 
shifted by 7r/2 from that of Eif. For the following discussion we need one more interpolation 



which gives 



(30) 



(31) 



with O (r**) oscillation, as expected. E4 oscillates in phase with the kinetic rather than with potential energy, which 
means that it is shifted by tt from Eif. 
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One might conclude that in this special case the total energy is conserved exactly with any time step size. We 
will later see, however, that actually the leapfrog trajectory deviates from an exact constant energy hypersurface, 
but these deviations are regular and they appear to be compensated by the approximation errors of the interpolation 
formulas. This occasional cancelation certainly results from the fundamental solutions of the finite difference equation 
(2) being sine and cosine functions j^, but it is rather instructive because it demonstrates that interpolations can 
produce very misleading effects. The most interesting for us, however, is the form of the oscillations produced by the 
interpolation formulas. We will now see that qualitatively similar behavior is observed in real simulations. 

Figs. 1-3 present a 50 fsec interval of a molecular dynamics trajectory computed with three different time steps: 
0.5 fsec, 0.1 fsec and 0.05 fsec, respectively. The system modeled consists of an immunoglobulin binding domain 
of streptococcal protein G jl^ which is a 56 residue a/ (3 protein subunit (file Ipgb in the Protein Database [ po[ ) 
with 24 bound water molecules presented in the crystal structure. All hydrogens are considered explicitly thus giving 
927 atoms. The trajectory was computed without constraints upon the protein structure and with TIP3P water 
molecules held rigid. Other details of the simulation protocol are given in Section II D below. In such a system the 
fastest oscillations are due to bond stretching of hydrogens with a period of about 10 fsec, which is clear in the time 
profiles of the potential energy shown in Figs. la-3a. Figures lb-3b show the time profile of the total energy computed 
by Eq. (25). Note that according to Eqs. (25), (29) and (31) the amplitude of the oscillation of the total energy grows 
with the frequency, which means that, in systems of many oscillators, interpolations tend to filter lower frequencies. 
It is not surprising therefore that in Figs. lb-3b the highest frequency strongly dominates compared with Figs. la-3a. 
Note, however, that in all three figures Eif oscillation has exactly the same phase as that of the potential energy. 

The profiles of the potential energy in Figs, la - 3a are indistinguishable, which means that with a step size of 0.5 
fsec the trajectory is perfectly accurate. In agreement with the simple case considered above the shapes of the profiles 
of the total energy in Figs. lb-3b also appear to be indistinguishable, with the amplitude of the oscillation scaled 
as the square of the step size. The identity of the three profiles in this case is not evident a priori and it is difficult 
to explain unless the above observations for an oscillator are not invoked. Thus, despite the simplicity of the model 
employed, Eq. (25) very accurately predicts both the growth rate with the step size and the shape of the fluctuation 
of the total energy in a much more complex system. 

Now consider the time profiles of the total energy computed with higher order interpolations. Figs. Ic and Id 
show that both E-^ and Ej^ behave similarly to Eif and as predicted by Eqs (29) and (31). Note that the oscillations 
of E^ and E^ are shifted by 7r/2 and tt, respectively, from Eij in Fig. lb. With a reduced step size, however, both 
i?3 and E^ reveal some new features. Similarly to Eif the E^ oscillation in Figs. 2c and 3c remains dominated by 
high-frequency oscillation, although its overall profile slightly changes. It is easy to see that the amplitude of the 
high-frequency oscillation scales as O {h?^ and that the change in the profile is likely to be due to an underlying 
fluctuation which scales slower than O [h^) . Finally, for E4 oscillation in Figs. 2d and 3d we observe a profile of the 
fluctuation qualitatively different from those in Figs, l(a-d). The amplitude of the oscillation in Fig. Id reduced by 
factors of 625 and 10** in Figs. 2d and 3d, respectively, would present a negligible part of the remaining fluctuation 
and that is why this hidden profile is revealed. This residual deviation of the total energy can be attributed to the 
interpolation-free error of the trajectory itself. Comparison of Figs. 2d and 3d shows that the amplitude of the 
remaining fiuctuation scales faster than O (/i^) but slower than O (ft."^) , as we expected. If scaled back to the step size 
of 0.5 fsec used in Fig.l this fluctuation can account for only a few percent of the whole amplitude, which indicates 
that the total energy is really much better conserved than it appears to be. It is interesting that the high-frequency 
harmonic oscillation that dominates in Figs.l (b-d) essentially disappears in Figs. 2d and 3d suggesting that harmonic 
motions in the system really do not contribute to the apparent fluctuation of the total energy, which is produced by 
slower anharmonic motions. 

Let us now summarize the above results. In the case of an oscillator, fluctuations are produced by interpolations only, 
their phases and amplitudes are related to those of the kinetic and potential energies in a simple way. For a realistic 
microcanonical ensemble with all common types of interactions, the total energy behaves similarly for the second and 
third order interpolations, as well as with the fourth order one at a moderately small step size. For an extremely small 
step size, however, a qualitative difference is observed between the fourth and lower order interpolations. All this 
agrees very well with the global error in the total energy between O (r^) and O (t^) corresponding to O (t'^) local 
error in the trajectory. Although the trajectory contribution can be revealed by interpolations of the fourth and higher 
orders, with practical time step values, the interpolation errors always dominate overwhelmingly. Taking into account 
that, because of the rapid growth of higher derivatives on repulsive walls of non- bonded potentials, higher order 
interpolations tend to be less accurate in "less harmonic" systems, we are obliged to conclude that straightforward 
interpolations in principle cannot adequately evaluate instantaneous on-step velocities and kinetic energies in the case 
of the leapfrog scheme. This approach, therefore, should not be used for assessing the accuracy of the algorithm. 
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D. Alternative Strategy for Checking Energy Conservation 



In this section we consider how numerical tests with microcanonical ensembles can be arranged so that the real 

accuracy of conservation of the total energy can be assessed. The main idea is simple. Note that all interpolations of 
the kinetic energy such as Eqs. (28) and (30) give the same average kinetic energy. For any sufficiently long analytical 
trajectory these interpolations, upon averaging, result in a correct average kinetic energy and, consequently, exact 
total energy. This property does not depend upon the specific form of the Hamiltonian or the number of degrees 
of freedom. Note, for example, that in Figs. 1-3 with the same step size, Eif, E3 and E4 oscillate around the 
same average. At the same time there is a distinguishable difference between the averages for the three given values 
of the step size. We see, therefore, that unlike instantaneous energies, their time averages appear to be free from 
interpolation errors and they represent adequate indicators of the accuracy of the energy conservation. 

Suppose we know the analytical value of the total energy. We could then repeatedly calculate a long enough test 
trajectory with gradually growing time steps and compare the average total energy with the analytical one. As long 
as these two values are close to each other, the computed trajectory remains close to a correct hypersurface in phase 
space. Instead of the unknown analytical total energy, we can use that obtained for the same trajectory with a very 
small step size. It is important to make sure, however, that the test trajectory starts from the same state on a fixed 
constant-energy hypersurface with each step size, which is not automatic in the case of the leapfrog scheme. Let us 
consider the implementation of this testing procedure for a harmonic oscillator. 

By using Eqs. (22a) and (24) one can derive an identity 



e 1 
2 + 2 



-c/„ 



(32) 



Denoting 



(33) 



we obtain 



Un = 



2(1-^) 



(e + Sn) 



(34) 



It can be shown by a straightforward solution of the finite difference equations (22a, b), that within the time step 
range of stability of an oscillator U = K. Therefore 6n presents an oscillation around a zero average and we have 



Eif = 2U = 



(35) 



Let us now check how accurately the numerical average total energy approximates exact values, that is total energies 
corresponding to constant-energy hypersurfaces sampled by a leapfrog trajectory. Consider an analytical trajectory 
passing through a leapfrog state ('',1-1/2! ^^n) . We have 



e = Ea< sin ut + cos co I t 



h 



cos (J t 



■ T sin ojt 



(36) 



From Eqs. (35) and (36) by using a Taylor series expansion, we obtain 



Elf 



sin 2LL>t 



^4 ^6 

cos2a;t \-0(t^ 

96 64 ^ 



(37) 



The r.h.s. of Eq. (37) involves two qualitatively different types of deviations. The first depends upon the current 
phase of the oscillation and it in fact presents the true measure of the energy conservation along the trajectory: as long 
as the phase-dependent terms in Eq. (37) are small, all states on the trajectory belong, or are close, to the same exact 
constant energy hypersurface. When these terms become too large, successive leapfrog states correspond to different 
hypersurfaces, but the corresponding analytical total energies oscillate around a certain value. The second type of 
deviation is phase-independent and it is presented by the fourth and sixth-order terms. Because of this deviation, 
with large step size all leapfrog states appear to belong to hypersurfaces of a systematically lower energy than the 
computed average value, which means that they are no longer sampled ergodically. The magnitudes of both these 
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types of deviations can be easily evaluated, and it turns out that for r < 1.1 the fluctuating part dominates, while 
above this level the phase-independent contribution rapidly becomes overwhelming due to its sixth order growth. 

Now consider what one would observe in the numerical tests outlined above. It can be seen from Eq. (35) that by 
simply increasing the step size one obtains an 0{t) regular growth of U and Eif. It is clear, however, that, in this 
case, the starting leapfrog state in phase space is effectively moved from one constant energy hypersurface to another, 
which results in a regular drift of averages. In order to remove this drift, the trajectory should start from the same 
hypersurface, that is from xq and w_i/2 corresponding to the same analytical trajectory. It is easy to see that this 
case is also described by Eq. (37) if it is read in an opposite sense. Namely, now Ea and the phase of the oscillating 
terms on the right are constant because they correspond to the initial constant energy hypersurface and the phase of 
the starting state. Equation (37) therefore describes the step size dependence of Eif which beyond r « 1.1 should 
rapidly deviate upward from its zero step size limit. It turns out that this estimate holds quite well for representative 
molecular models as well, which is illustrated by Fig. 4. 

This figure presents an example of application of the proposed test to protein dynamics. The simulations were made 
by AMBER molecular modeling program with AMBER94 force field ||2|] . The system considered was same as in 
Figs. 1-3, that is with only water bond lengths and bond angles constrained by the SHAKE algorithm |^^. Initial data 
for these tests were prepared as follows. The equilibration was more or less standard and included minimization of the 
crystal structure followed by a 12.5 psec run starting from Maxwell distribution at 300K with periodic temperature 
control and a step size of 0.5 fsec. After that the step size was reduced to 0.25 fsec and a short trajectory of 150 fsec 
was calculated, with the final part stored and used in place of an analytical trajectory for generating initial data. The 
last point was used as the starting state for a 10 psec test trajectory which was computed with gradually growing 
step sizes and half-step velocities taken at appropriate time intervals from coordinates. 

In order to apply our test to a real molecular system we should also take into account that, since in common empirical 
potentials the absolute value of the potential energy involves many non-harmonic contributions which always remain 
far from zero, it is no longer sensible to compare the deviations with the absolute energy values. An appropriate 
natural scale, however, is given by the variance of the instantaneous potential energy over the test trajectory. One 
may reasonably consider for U the deviation of zbO.lD [U], for instance, as an acceptable level of accuracy. Note that 
in the oscillator case this gives approximately same criterion as above. The deviation of Ei f should be two times 
larger because it involves a similar contribution from the kinetic energy. 

Figure 4 (a) shows the step size dependence of D [U] . The dotted line in this figure shows the low step size level used 
to plot the corresponding bands of acceptable accuracy in Figs. 4 (b) and (c). As seen in Fig. 4(b) the fluctuation 
of U remains within the band ±0.1D [U] up to a time step of 1.7 fsec. These fluctuations are relatively large because 
they are affected by rare conformational transitions which cannot be averaged during a test trajectory. Oppositely, 
El f shown in Fig. 4 (c) grows steadily, but its deviation also remains within the band of acceptable accuracy up to 
the same step size. Assuming that this value corresponds to r « 1.1 one obtains a frequency of 3400 cm"-'^, that is 
exactly that of bond stretching of peptide hydrogens. The whole spectrum of bond stretching modes of hydrogens 
covers the range 3000-3700 cm~^, which corresponds to characteristic step sizes in the range 1.6-1.9 fsec. We may 
conclude, therefore, that Fig. 4 demonstrates a good agreement with the single oscillator model considered above. 

At this point it is convenient to compare our step size estimates with the common recommendations. Although 
no strict rules exist, it is usually considered that the computed trajectory provides an acceptably accurate sampling 
when the relative fluctuation qif given by Eq. (26) is less than 0.1, which gives r ^ 0.4 , i.e. at least 14 time steps 
per a single cycle of the oscillation. Our estimates show that actually this level of accuracy is reached with an almost 
three times larger time step, although in this case qif is already so large that the energy conservation seems to be 
lost. In practice, however, the above recommendations are almost never respected and most often molecular dynamics 
trajectories are computed with less than 10, or even less than 5 steps, per the shortest cycle [Q. This practice thus 
appears to be well justifled and it is clear from Fig. 4 that the leapfrog trajectory really manages to hold to a constant 
energy hypersurface with somewhat larger time steps than commonly recommended. 



E. Leapfrog Scheme Versus Its Relatives 



In this section we will demonstrate that, contrary to the conventional point of view the leapfrog scheme exhibits 
exceptional properties compared with the other algorithms of this group. 

The results obtained in Section II C above obviously are not applicable to the velocity Verlet and Beeman algorithms, 
Eqs. (4-5). In these cases, interpolations appear to be built into the algorithms, which makes them only O (ft."^) 
accurate in velocities. (Note that interpolation (7) has the same order of truncation as that in Eq. (6)). Oscillations 
of the instantaneous total energy result from the algorithms themselves and they grow simply as O (/i^) in agreement 
with the order of the truncation error. Let us look more thoroughly at the total energy computed by the velocity 
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Verlet integrator. Similarly to Eq.(25) we obtain for the oscillator 




n 



(38) 



We see that E^y behaves similarly to Ei f with two times smaller amplitude of the oscillation and always Eyy < Eif, 
which means that, in the case of velocity Verlet, the temperature estimated for the same trajectory is somewhat lower. 
In real simulations Qyy is normally smaller by a factor of 0.5-0.7 and the corresponding temperature can be lower 
by as much as a few degrees. The origin of these differences is clear from the previous discussion and it is not very 
interesting. 

Since Eij and Ey^ are O (r^) different it is clear that Em gives one order less accurate estimate of the total energy 
of the analytical trajectory approximated by leapfrog states. The same order of approximation is obtained if we 
consider the trajectories passing through velocity Verlet states. In this case the analytical energies are given simply 
by Eq. (38), and we have 



which shows that successive states belong to hypersurfaces of O(t^) different energies. If we consider Ea and J7„ in 
Eq. (|3^) as constants and Eyy and r as variables we obtain the rate of growth of the deviation of Eyy with the step 
size. We see that it is on the order of O(t^), i.e. similar to the leapfrog scheme with fixed e according to Eq. (35). 
All these derivations can be readily reproduced for Beeman and the original Stormer algorithms which both deviate 
from their starting hypersurfaces as 0{t^). 

The above results can be summarized as follows. Consider a numerical trajectory of an oscillator in phase space. 
It may be represented by Stormer- Verlet states (a;„,a;„_i), velocity Verlet states (a;„,u„) or by leapfrog states 
(a;„, u„_i/2)- According to Eq. (37), the leapfrog states sample from constant energy hypersurfaces which are 0{t^) 
close to each other, while in the former two cases a broader O(t^) spectrum of energies is covered. In this sense one 
can say that the leapfrog trajectory presents the most accurate sampling among the three representations. 



The results presented in the paper shed some light upon the long-standing confusion involved in the conventional 
interpretation of the common algorithms of the classical molecular dynamics. The main practical result is a new 
simple testing scheme for energy conservation, which gives a better estimate of the quality of leapfrog trajectories. It 
is shown that the leapfrog trajectories provide accurate sampling in phase space with large time step values, when the 
apparent total energy obtained by routine interpolations is no longer conserved. This explains the recognized quality 
of thermodynamic averages at large step size. The new testing scheme should be particularly useful for studying the 
time step limitations in various molecular models appearing in internal coordinate molecular dynamics simulations of 
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FIG. 1. Time dependence of the potential energy (a) and the total energy (c-d) for an unconstrained protein molecule. All 
energies are in kcal/mole. The molecular dynamics trajectory was computed with the leapfrog algorithm and a time step of 

0.5 fsoc. Throe different estimates of the total energy for figures (c)-(d) were obtained with interpolations of second, third and 
fourth order, respectively, applied to half-step kinetic energies. The average total energy subtracted from the instantaneous 
values in figures (c)-(d) was 254.68 kcal/mole. 

FIG. 2. Same as Fig. 1, but with the molecular dynamics trajectory computed with a time step of 0.1 fsec. The average 
total energy in Figs, (c)-(d) was 255.42 kcal/mole. 



FIG. 3. Same as Fig. 1, but with molecular dynamics trajectory computed with a time step of 0.05 fsec. The average total 
energy in Figs, (c)-(d) was 255.51 kcal/mole. 

FIG. 4. Time step depondoncios of the potential energy (b), its time variance (a) and the total energy (c) computed over 
a 10 psec molecular dynamics trajectory. All energies are in kcal/mole. The dotted lines in figures (b) and (c) show the 
corresponding bands of acceptable deviation defined as described in the text from the low time step limit of variance indicated 
by the dotted line in figure (a). 
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